
Adaptive Basis Selection for Exponential Family Smoothing Splines with Application in Joint Modeling of Multiple Sequencing Samples

Abstract

Second-generation sequencing technologies have replaced array-based technologies and become the default method for genomics and epigenomics analysis. Second-generation sequencing technologies sequence tens of millions of DNA/cDNA fragments in parallel. After the resulting sequences (short reads) are mapped to the genome, one gets a sequence of short read counts along the genome. Effective extraction of signals in these short read counts is the key to the success of sequencing technologies. Nonparametric methods, in particular smoothing splines, have been used extensively for modeling and processing single sequencing samples. However, nonparametric joint modeling of multiple second-generation sequencing samples is still lacking due to computational cost. In this article, we develop an adaptive basis selection method for efficient computation of exponential family smoothing splines for modeling multiple second-generation sequencing samples. Our adaptive basis selection gives a sparse approximation of smoothing splines, yielding a lower-dimensional effective model space for a more scalable computation. The asymptotic analysis shows that the effective model space is rich enough to retain essential features of the data. Moreover, exponential family smoothing spline models computed via adaptive basis selection are shown to have good statistical properties, e.g., convergence at the same rate as that of full-basis exponential family smoothing splines. The empirical performance is demonstrated through simulation studies and two second-generation sequencing data examples.
